Fix integration test failing on first Dependabot PR run by lym953 · Pull Request #692 · DataDog/serverless-plugin-datadog

lym953 · 2026-04-10T21:38:44Z

The entire PR (including the summary below) was generated by Claude Code.

Problem

The integration-tests CI job fails on the first run of every Dependabot PR with:

Error:
The security token included in the request is invalid.

This happens during sls package for serverless-extension.yml, but not for serverless-forwarder.yml. Dependabot PRs run without valid AWS credentials, which causes sls package to fail when it tries to call AWS.

Root cause

When serverless.yml defines a top-level layers: section, Serverless Framework v3 calls CloudFormation.describeStacks for each layer in compareWithLastLayer() (in lib/plugins/aws/package/compile/layers.js). This is an optimisation that checks whether a layer has already been uploaded to avoid re-uploading unchanged ones. The error handler in that function only silences errors whose message contains "does not exist" — any other AWS error is re-thrown:

(e) => {
    if (e.message.includes('does not exist')) {
        return;
    }
    throw e;  // re-throws "security token invalid" → sls package fails
}

The forwarder test is immune because our plugin wraps all CloudWatch calls in a broad catch that returns [], silently absorbing credential errors. The extension test has no such safety net.

Fix

Add a lightweight test-only Serverless plugin (integration_tests/offline.js) that hooks into before:package:compileLayers and intercepts CloudFormation.describeStacks calls, immediately rejecting them with a synthetic "does not exist" error. This makes compareWithLastLayer treat every snapshot test run as a fresh stack, which is the correct behaviour for snapshot tests where we never actually deploy.

serverless-extension.yml is updated to load this plugin.

How I tested it

Confirmed the root cause by running sls package locally with placeholder invalid credentials against both configs:

# Extension config (before fix) → EXIT: 1
AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE AWS_SECRET_ACCESS_KEY=... serverless package --config serverless-extension.yml

# Forwarder config → EXIT: 0 (errors swallowed by plugin)
AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE AWS_SECRET_ACCESS_KEY=... serverless package --config serverless-forwarder.yml

Verified the fix by running the same command after adding offline.js:

# Extension config (after fix) → EXIT: 0, "Service packaged"
AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE AWS_SECRET_ACCESS_KEY=... serverless package --config serverless-extension.yml

Verified snapshot is unchanged by applying all the normalisation steps from run_integration_tests.sh to the generated template and diffing against correct_extension_snapshot.json — no differences.

Cross-referenced with PR history: PR #666 (a previous Dependabot PR) had the identical failure on its first run and was fixed only by pushing a run ci commit to trigger a fresh CI run, confirming this is a consistent first-run issue rather than a one-off flake.

When custom Lambda layers are defined, Serverless Framework calls CloudFormation.describeStacks in compareWithLastLayer() to check whether the layer has already been uploaded. Its error handler only silences "does not exist" responses; any other error – including the "security token invalid" response AWS returns when credentials are absent – is re-thrown, failing sls package. Add a lightweight test-only Serverless plugin (offline.js) that intercepts CloudFormation.describeStacks during packaging and returns a synthetic "does not exist" error. This makes compareWithLastLayer treat every run as a fresh stack, which is the correct behaviour for snapshot tests where we never actually deploy. The snapshot output is unchanged because S3 key timestamps are already normalised. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

codecov-commenter · 2026-04-10T21:40:07Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 77.69%. Comparing base (a812da7) to head (6be95f7).

Additional details and impacted files

@@           Coverage Diff           @@
##             main     #692   +/-   ##
=======================================
  Coverage   77.69%   77.69%           
=======================================
  Files          12       12           
  Lines        1112     1112           
  Branches      350      350           
=======================================
  Hits          864      864           
  Misses        118      118           
  Partials      130      130

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

ava-silver

The fix is correct and well-explained — the root cause analysis is solid and the offline.js shim does what it needs to do. One thing worth doing alongside this (or in a follow-up) is a naming pass, because the mislabeling is what makes this shim look suspicious in the first place.

These tests never deploy anything — they run sls package, generate a CloudFormation template, and diff it against a checked-in snapshot. That's a snapshot test, not an integration test. Calling them "integration tests" implies real AWS connectivity is expected, which is exactly why the credential failure looked like a bug rather than an obvious non-issue.

The naming is scattered across several places that would all need updating together:

integration_tests/ directory → snapshot_tests/ (or packaging_tests/)
integration_tests/run_integration_tests.sh → snapshot_tests/run_snapshot_tests.sh
integration_tests/serverless-extension.yml / serverless-forwarder.yml → move accordingly
The CI workflow file and its job name — both currently called integration-tests in the Actions runs
The integrationTesting / testingMode plugin option in src/env.ts is a separate concept (it bypasses forwarder ARN validation), so that one is fine to leave alone

For context on why there's no cleaner upstream fix: --noDeploy was removed in Serverless Framework v2.36.0 and sls package is the recommended replacement — which is exactly what these tests already use. The compareWithLastLayer() credential issue has been an open bug since serverless/serverless#8187 (2020), and there's an open feature request for a --artifacts-only flag at serverless/serverless#12969 (Dec 2024) with no movement. So the shim is the right call for now — it just deserves a comment pointing at those issues so future maintainers know this isn't accidental and know when it can be removed.

Co-authored-by: Ava Silver <ava.silver@datadoghq.com>

lym953 marked this pull request as ready for review April 10, 2026 21:45

lym953 requested a review from a team as a code owner April 10, 2026 21:45

lym953 requested a review from TalUsvyatsky April 10, 2026 21:45

ava-silver approved these changes Apr 13, 2026

View reviewed changes

Comment thread integration_tests/offline.js

Update integration_tests/offline.js

6be95f7

Co-authored-by: Ava Silver <ava.silver@datadoghq.com>

gh-worker-dd-devflow-36fce6 bot added mergequeue-status: waiting mergequeue-status: queued mergequeue-status: in_progress and removed mergequeue-status: waiting mergequeue-status: queued labels Apr 13, 2026

gh-worker-dd-mergequeue-cf854d bot merged commit fccf074 into main Apr 13, 2026
8 checks passed

gh-worker-dd-devflow-36fce6 bot removed the mergequeue-status: in_progress label Apr 13, 2026

gh-worker-dd-mergequeue-cf854d bot deleted the yiming.luo/fix-integration-test-first-run branch April 13, 2026 18:22

gh-worker-dd-devflow-36fce6 bot added the mergequeue-status: done label Apr 13, 2026

ava-silver mentioned this pull request Apr 14, 2026

chore: remove monitors test from pr workflow #695

Merged

10 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix integration test failing on first Dependabot PR run#692

Fix integration test failing on first Dependabot PR run#692
gh-worker-dd-mergequeue-cf854d[bot] merged 2 commits intomainfrom
yiming.luo/fix-integration-test-first-run

lym953 commented Apr 10, 2026 •

edited

Loading

Uh oh!

codecov-commenter commented Apr 10, 2026 •

edited

Loading

Uh oh!

ava-silver left a comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

lym953 commented Apr 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Problem

Root cause

Fix

How I tested it

Uh oh!

codecov-commenter commented Apr 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

ava-silver left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

lym953 commented Apr 10, 2026 •

edited

Loading

codecov-commenter commented Apr 10, 2026 •

edited

Loading